Crime and Prosperity#

Student names: Chris Dukker, David Snoeks, Ryan Rodrigus, Jason Roes

Team number: D2

# Display an image from a URL at a reduced size, with a subtitle
from IPython.display import Image, display

# URL of the image to display
url = 'https://ccv-secondant.nl/fileadmin/w/secondant_nl/platform/illustraties/Internationaal.jpg'

# Set the desired image width and height
width = 600
height = 300

# Set the subtitle text
subtitle = "© CCV / Hans Sprangers"

# Create an Image instance with the URL
image = Image(url=url, width=width, height=height)

# Display the image and subtitle
display(image)
print(subtitle)
© CCV / Hans Sprangers

Introduction#

Europe is, broadly speaking, prosperous. Several of the world's strongest economies are on this continent, and in economic terms the EU can measure itself against other major powers. This prosperity is not evenly distributed, however. Some countries are more prosperous than others, and within countries there is economic inequality as well, because wealth is unevenly distributed. Crime differs too: countries face different amounts of crime, and not every country suffers from the same types of crime. The question we address in this data story is: is there a relationship between a country's prosperity (and its distribution) and the amount of illegal activity in that country?

In this data story we investigate whether wealth inequality and average income per person influence the number of crimes committed, and whether these variables correlate more strongly with particular categories of crime, such as murder, rape, theft and fraud. We do so using World Bank Group data on the GINI coefficient (a measure of income or wealth inequality), the economic statistics and growth dataset from World Bank Open Data, and the crime statistics dataset from Eurostat.
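As a rough illustration of what the GINI coefficient measures, the sketch below computes it for a short list of incomes. The function name `gini` and the sample incomes are made up for illustration; the World Bank values used in this data story are published figures (usually reported on a 0 to 100 scale) and are not computed this way here.

```python
import numpy as np

def gini(incomes):
    """Gini coefficient of non-negative incomes: 0 means perfect equality,
    values close to 1 mean that almost all income goes to a few people."""
    x = np.sort(np.asarray(incomes, dtype=float))
    n = x.size
    if n == 0 or x.sum() == 0:
        raise ValueError("need at least one positive income")
    # Formula on ascending sorted values:
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i = 1..n
    ranks = np.arange(1, n + 1)
    return 2.0 * np.sum(ranks * x) / (n * x.sum()) - (n + 1) / n

# Hypothetical incomes, for illustration only
equal = gini([30_000, 30_000, 30_000, 30_000])
unequal = gini([0, 0, 0, 120_000])
print(f"equal incomes: {equal:.2f}, one person earns everything: {unequal:.2f}")
```

With four people, the most unequal case gives (n - 1) / n = 0.75 rather than 1; the upper bound only approaches 1 as the population grows.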

According to De Courson & Nettle (2021), for people with a low income and little capital, crime is the best way to improve their quality of life. Although there is a risk of being caught, the potential gain from success is worth that risk. Because their prospects are limited, crime is the best way for this group to improve their lives. According to the same study, high inequality leads to more crime, whereas a fairer distribution of wealth has positive effects and reduces the potential benefits of crime.

More Crime Due to Wealth Inequality#

Crime and wealth (inequality) are strongly related, as the arguments below show.

Relationship Between Theft, Fraud and Inequality#

Correlation 1 with chart


The Second Argument of Your First Perspective#


import plotly.graph_objects as go
import pandas as pd


Your Second Perspective#

Crime and wealth (inequality) are not strongly related, or other important factors are at play, as the arguments below show.

The First Argument of Your Second Perspective#

Although you might expect more crime in poorer countries, mainly theft, which makes up the largest share of reported crimes, this is not the case. When we analysed our dataset, we found that, in general, the richer a country is, the higher its total number of reported crimes.

import pandas as pd
import plotly.express as px
import statsmodels  # not used directly; Plotly needs it installed for trendline="ols"

# Load the World Bank indicators and the Eurostat crime rates
bank1_df = pd.read_csv("world_bank_definitive.csv")
crime_df = pd.read_csv("europe_crime_definitive_per_100k.csv")

# Keep only the GDP-per-capita indicator and give its value column a descriptive name
bank_df = bank1_df[bank1_df['Indicator Name'] == "GDP per capita, PPP (constant 2021 international $)"]
bank_df = bank_df.rename(columns={"Value": "GDP per capita, PPP (constant 2021 international $)"})

# Sum all crime categories into one total per country and year.
# At least one of these columns holds text, which made a plain .sum(axis=1)
# raise "TypeError: can only concatenate str (not "int") to str".
# Non-numeric entries are therefore coerced to NaN first; sum() skips NaN.
crime_columns = [col for col in crime_df.columns if col not in ["Country Name", "Year"]]
crime_values = crime_df[crime_columns].apply(pd.to_numeric, errors="coerce")
crime_df["Total Crime Rate per 100k"] = crime_values.sum(axis=1)

# Join crime totals and GDP per capita on country and year
merged_df = pd.merge(
    crime_df[["Country Name", "Year", "Total Crime Rate per 100k"]],
    bank_df[["Country Name", "Year", "GDP per capita, PPP (constant 2021 international $)"]],
    on=["Country Name", "Year"]
)

# Create scatter plot with trendline
fig = px.scatter(
    merged_df,
    x="GDP per capita, PPP (constant 2021 international $)",
    y="Total Crime Rate per 100k",
    hover_name="Country Name",
    hover_data={"Year": True},
    trendline="ols",  # Ordinary Least Squares regression line
    title="GDP per Capita vs. Total Crime Rate per 100k with Trendline"
)

fig.update_layout(
    xaxis_title="GDP per capita, PPP (constant 2021 international $)",
    yaxis_title="Total Crime Rate per 100k"
)

fig.show()
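To put a number on the relationship that the scatter plot is meant to show, one could compute a correlation coefficient between the two columns. The sketch below uses a tiny made-up table in place of `merged_df`; the frame `toy_df`, its shortened column name "GDP per capita" and all values are hypothetical and are not taken from the World Bank or Eurostat data.

```python
import pandas as pd

# Hypothetical stand-in for merged_df; values are illustrative only
toy_df = pd.DataFrame({
    "Country Name": ["A", "B", "C", "D", "E"],
    "GDP per capita": [20_000, 30_000, 40_000, 50_000, 60_000],
    "Total Crime Rate per 100k": [1_500, 2_100, 2_400, 3_300, 3_600],
})

gdp = toy_df["GDP per capita"]
crime = toy_df["Total Crime Rate per 100k"]

# Pearson's r measures how close the points lie to a straight line
pearson_r = gdp.corr(crime)

# Spearman's rho is Pearson's r on the ranks, so it only looks at the ordering
spearman_rho = gdp.rank().corr(crime.rank())

print(f"Pearson r = {pearson_r:.3f}, Spearman rho = {spearman_rho:.3f}")
```

A coefficient like this only describes an association in reported crime; on its own it says nothing about cause.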

Figure 4: GDP per capita (PPP, constant 2021 international $) plotted against the total crime rate per 100k, with an OLS trendline.


The Second Argument of Your Second Perspective#


import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.io as pio
import pycountry

pio.renderers.default = 'notebook'

# === Load GINI Data ===
gini_df = pd.read_csv("gini_definitive.csv")
gini_df['Year'] = gini_df['Year'].astype(int)

# === Load Theft Data ===
theft_df = pd.read_csv("europe_crime_definitive_absolute.csv")
theft_df.rename(columns={'geo': 'Country Code', 'TIME_PERIOD': 'Year'}, inplace=True)

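# Convert ISO-2 to ISO-3
def convert_iso2_to_iso3(code):
    """Return the ISO 3166-1 alpha-3 code for an alpha-2 code, or None if unknown."""
    try:
        country = pycountry.countries.get(alpha_2=code)
    except LookupError:
        # Raised for missing or non-string codes (KeyError is a LookupError too)
        return None
    return country.alpha_3 if country else None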

theft_df['Country Code'] = theft_df['Country Code'].apply(convert_iso2_to_iso3)

# Manual fixes for special regions/countries
manual_fix = {
    'England and Wales': 'GBR',
    'Northern Ireland (UK) (NUTS 2021)': 'GBR',
    'Scotland (NUTS 2021)': 'GBR',
    'Greece': 'GRC',
    'Kosovo*': 'XKX'
}

theft_df['Country Code'] = theft_df.apply(
    lambda row: manual_fix[row['Geopolitical entity (reporting)']] 
    if pd.isnull(row['Country Code']) and row['Geopolitical entity (reporting)'] in manual_fix
    else row['Country Code'],
    axis=1
)

theft_df['Year'] = theft_df['Year'].astype(int)
theft_df['Theft'] = pd.to_numeric(theft_df['Theft'], errors='coerce').fillna(0)

# === Years intersection and max year 2022 ===
years = sorted(list(set(gini_df['Year']).intersection(set(theft_df['Year']))))
years = [year for year in years if year <= 2022]

# Create subplot
fig = make_subplots(
    rows=1, cols=2,
    specs=[[{'type': 'choropleth'}, {'type': 'choropleth'}]],
    subplot_titles=('GINI Index', 'Theft Incidents')
)

# Define color scales
gini_min, gini_max = gini_df['Value'].min(), gini_df['Value'].max()
theft_min, theft_max = theft_df['Theft'].min(), theft_df['Theft'].max()

# Add base traces (Year = first year)
fig.add_trace(
    go.Choropleth(
        locations=gini_df[gini_df['Year'] == years[0]]['Country Code'],
        z=gini_df[gini_df['Year'] == years[0]]['Value'],
        text=gini_df[gini_df['Year'] == years[0]]['Country Name'],
        colorscale='Viridis',
        zmin=gini_min,
        zmax=gini_max,
        colorbar=dict(title='GINI', x=0.45)  # position colorbar left
    ),
    row=1, col=1
)

fig.add_trace(
    go.Choropleth(
        locations=theft_df[theft_df['Year'] == years[0]]['Country Code'],
        z=theft_df[theft_df['Year'] == years[0]]['Theft'],
        text=theft_df[theft_df['Year'] == years[0]]['Geopolitical entity (reporting)'],
        colorscale='Reds',
        zmin=theft_min,
        zmax=theft_max,
        colorbar=dict(title='Theft', x=1.0)  # position colorbar right
    ),
    row=1, col=2
)

# Animation frames
frames = []
for year in years:
    frame = go.Frame(
        data=[
            go.Choropleth(
                locations=gini_df[gini_df['Year'] == year]['Country Code'],
                z=gini_df[gini_df['Year'] == year]['Value'],
                text=gini_df[gini_df['Year'] == year]['Country Name']
            ),
            go.Choropleth(
                locations=theft_df[theft_df['Year'] == year]['Country Code'],
                z=theft_df[theft_df['Year'] == year]['Theft'],
                text=theft_df[theft_df['Year'] == year]['Geopolitical entity (reporting)']
            )
        ],
        name=str(year)
    )
    frames.append(frame)

# Update layout
fig.update_layout(
    title_text='GINI Index and Theft Incidents in Europe per Year',
    title_x=0.5,
    geo=dict(
        showframe=False,
        showcoastlines=True,
        lataxis_range=[30, 72],
        lonaxis_range=[-25, 45],
        projection_type='natural earth'
    ),
    geo2=dict(  # for the 2nd map
        showframe=False,
        showcoastlines=True,
        lataxis_range=[30, 72],
        lonaxis_range=[-25, 45],
        projection_type='natural earth'
    ),
    sliders=[{
        "steps": [{
            "args": [[str(year)], {"frame": {"duration": 500, "redraw": True}, "mode": "immediate"}],
            "label": str(year),
            "method": "animate"
        } for year in years],
        "transition": {"duration": 300},
        "x": 0.1,
        "len": 0.8
    }],
    updatemenus=[{
        "buttons": [{
            "args": [None, {"frame": {"duration": 500, "redraw": True}, "fromcurrent": True}],
            "label": "Play",
            "method": "animate"
        }, {
            "args": [[None], {"frame": {"duration": 0}, "mode": "immediate"}],
            "label": "Pause",
            "method": "animate"
        }],
        "direction": "left",
        "pad": {"r": 10, "t": 70},
        "showactive": False,
        "type": "buttons",
        "x": 0.1,
        "xanchor": "right",
        "y": 0,
        "yanchor": "top"
    }]
)

fig.frames = frames

fig.show()
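One detail of the mapping in the code above: England and Wales, Northern Ireland and Scotland are all assigned `GBR`, so a single year can contain several rows for the same map location. A possible way to handle this, sketched below on made-up numbers, is to sum the rows per country code and year before plotting. Whether summing is appropriate for the real data is an assumption; the frame `toy_theft_df` and its values are hypothetical.

```python
import pandas as pd

# Hypothetical rows shaped like theft_df after the ISO-3 conversion
toy_theft_df = pd.DataFrame({
    "Country Code": ["GBR", "GBR", "GBR", "NLD"],
    "Year": [2020, 2020, 2020, 2020],
    "Theft": [1000.0, 50.0, 100.0, 400.0],
})

# Collapse regional rows into a single row per country code and year
theft_by_country = (
    toy_theft_df
    .groupby(["Country Code", "Year"], as_index=False)["Theft"]
    .sum()
)
print(theft_by_country)
```

The aggregated frame then has exactly one value per location and year, which is what a choropleth trace expects.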

Figure 5: GINI index (left) and number of theft incidents (right) per European country, animated per year.


import pandas as pd
import plotly.express as px
import random


Reflection#


Work Distribution#

Jason focused on preprocessing the datasets. After that, he mainly worked on coordinating the collaboration and on the accompanying text for the data story.

References#

De Courson, B., & Nettle, D. (2021). Why do inequality and deprivation produce high crime and low trust? Scientific Reports, 11, 1937. https://doi.org/10.1038/s41598-020-80897-8